Abstract

The total atomization energy at absolute zero (TAE_0) of benzene, C6H6,
was computed fully ab initio by means of W2h theory as 1306.6 kcal/mol,
to be compared with the experimentally derived value 1305.7±0.7 kcal/mol.
The computed result includes contributions from inner-shell correlation (7.1 
kcal/mol), scalar relativistic effects (-1.0 kcal/mol), atomic spin-orbit split- 
ting (-0.5 kcal/mol), and the anharmonic zero-point vibrational energy (62.1 
kcal/mol). The largest-scale calculations involved are CCSD/cc-pV5Z and 
CCSD(T)/cc-pVQZ; basis set extrapolations account for 6.3 kcal/mol of the 
final result. Performance of more approximate methods has been analyzed. 
Our results suggest that, even for systems the size of benzene, chemically accu- 
rate molecular atomization energies can be obtained from fully first-principles 
calculations, without resorting to corrections or parameters derived from ex- 
periment. 

Computational thermochemistry is coming of age as part of the chemist's toolbox [1].
Popular approaches (such as G3 theory [2] and CBS-QB3 [3]) that can lay claim to 'chemical
accuracy' (1 kcal/mol) on average for small systems invariably rely on a combination of
relatively low-level ab initio calculations and sophisticated empirical correction schemes,
which have been parametrized against experimental data.

In recent years, a number of groups have focused on obtaining accurate thermodynamic
data of small molecules by means of fully ab initio approaches (i.e. devoid of parameters
derived from experiment); the reader is referred to studies by e.g. Dixon [4,5], Klopper [6],
Bauschlicher [7], and Martin [8]. Very recently, we developed two near-black-box methods of
this type, known as W1 and W2 theory (for Weizmann-1 and -2, respectively); in the original
paper [9] and a subsequent validation study [10] for most of the G2/97 data set [11,12], we
have shown that these methods yield thermochemical data in the kJ/mol accuracy range for
small systems that are well described by a single reference configuration.

The question arises as to how well such methods would 'scale up' to larger systems. For 
this purpose, the ubiquitous benzene molecule would appear to offer an excellent 'stress 
test'. It has six heavy atoms, yet its heat of formation is known precisely from experiment, 
and its high symmetry makes it amenable to fairly large-scale treatments with modern high- 
performance computing hardware. In the present note, we shall discuss the performance, for the
total atomization energy of benzene (TAE_e if zero-point exclusive, TAE_0 at 0 K), of the more
rigorous W2h theory, of the more widely applicable W1 and W1h theories, and of a variety
of more approximate approaches.

All calculations involved in W1, W1h, and W2h theory were carried out using MOLPRO
98.1 [13] running on a Compaq ES40 minisupercomputer in our laboratory. (For the open-shell
calculations on carbon, the definition of the CCSD(T) [14] energy according to Ref. [15]
has been used.) Detailed descriptions and justifications of the various steps involved
can be found in Refs. [9,10]. We merely note here that, for the system under study, the
final result at the highest level of theory (W2h) consists of the following components: (a) an
SCF limit extrapolated from SCF/cc-pVnZ (correlation consistent polarized valence n-tuple
zeta [16], with n=T,Q,5) energies using the formulas $E(n) = E_\infty + B/C^n$ (old style [9])
or $E(n) = E_\infty + A/n^5$ (new style [10]); (b) a CCSD valence correlation limit extrapolated
from CCSD/cc-pVnZ (n=Q,5) results using $E(n) = E_\infty + A/n^3$; (c) a limit for the effect
of connected triple excitations extrapolated from [CCSD(T)/cc-pVnZ - CCSD/cc-pVnZ]
(n=T,Q) using $E(n) = E_\infty + A/n^3$; (d) an inner-shell correlation contribution obtained at
the CCSD(T)/MTsmall level; (e) a scalar relativistic (first-order Darwin and mass-velocity
[17,18]) contribution obtained as an expectation value from the ACPF/MTsmall [19] wave
function; (f) a first-order spin-orbit correction derived from the fine structure of the
constituent atoms; and (g) the anharmonic zero-point energy (vide infra). The computationally
most intensive step was the CCSD/cc-pV5Z calculation. At 876 basis functions, with 30
electrons correlated, this could not be carried out using a conventional algorithm even while
exploiting the $D_{2h}$ subgroup of $D_{6h}$; using the direct CCSD algorithm of Lindh, Schütz, and
Werner [20] as implemented in MOLPRO, it took 14 days of CPU time on a 667 MHz Alpha
EV67 CPU with 768 MB of memory allocated. (The CCSD(T)/cc-pVQZ optimum geometry
required for the W2h calculations was taken from Ref. [21].)
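
The extrapolation formulas in steps (a) to (c) can each be solved in closed form for the basis set limit. The Python sketch below is illustrative only: the function names are hypothetical, the usual cardinal numbers (D=2, T=3, Q=4, 5=5) are assumed, and the sample energies are synthetic values generated from the model formulas, not data from the W2h calculation.

```python
def extrap_power(e_small, e_large, n_small, n_large, alpha):
    """Two-point limit of E(n) = E_inf + A / n**alpha.

    Eliminating A between the two equations gives
    E_inf = E(n_large) + [E(n_large) - E(n_small)] / [(n_large/n_small)**alpha - 1].
    """
    ratio = (n_large / n_small) ** alpha
    return e_large + (e_large - e_small) / (ratio - 1.0)


def extrap_geometric(e1, e2, e3):
    """Three-point limit of E(n) = E_inf + B / C**n for consecutive n.

    Successive differences form a geometric series with ratio 1/C,
    so E_inf = e3 - d2**2 / (d2 - d1).
    """
    d1 = e2 - e1
    d2 = e3 - e2
    return e3 - d2 * d2 / (d2 - d1)


if __name__ == "__main__":
    # Synthetic energies (hypothetical, arbitrary units) built from the models.
    e_inf, a = -1.0, 0.5
    e_q = e_inf + a / 4**3          # n = Q
    e_5 = e_inf + a / 5**3          # n = 5
    print(extrap_power(e_q, e_5, 4, 5, alpha=3.0))

    b, c = 1.0, 3.0
    e_t, e_q, e_5 = (e_inf + b / c**n for n in (3, 4, 5))
    print(extrap_geometric(e_t, e_q, e_5))
```

Because the synthetic points lie exactly on the model curves, both calls recover the assumed limit up to rounding error. With real energies the two SCF formulas will in general give slightly different limits.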

The W1h calculations primarily differ in that the extrapolations are carried out with
smaller cc-pVnZ (n=D,T,Q) basis sets (and $E(n) = E_\infty + A/n^{3.22}$ for the correlation steps,
see Ref. [9] for its derivation), while in W1 theory, the carbon basis set is in addition augmented
with diffuse functions [22]. All relevant data for the W2h calculation are collected in Table
I. Calculations using more approximate methods such as G3 theory and CBS-QB3
were carried out using their respective implementations in Gaussian 98 [23].

For a molecule this size, the zero-point vibrational energy (ZPVE) is large enough that
even fairly small relative errors may compromise the quality of the final TAE. Handy and
coworkers [24] computed a quartic force field at the B3LYP/TZ2P [25,26] level; from their
published anharmonicity constants (in particular the set deperturbed for Fermi resonances
closer than 100 cm$^{-1}$), we obtain an anharmonic ZPVE of 62.04 kcal/mol. At the same
level of theory, one-half the sum of the harmonics, $\sum_i \omega_i d_i/2$ (with $d_i$ the degeneracy of
mode $i$), comes out 0.9 kcal/mol too high at 62.96 kcal/mol, while one-half the sum of the
fundamentals, $\sum_i \nu_i d_i/2$, comes out 1 kcal/mol too low at 60.98 kcal/mol. The average of
both estimates, $\sum_i (\omega_i + \nu_i) d_i/4$ = 61.97 kcal/mol, is only 0.07 kcal/mol below the true
anharmonic value. From the best available computed harmonic frequencies, CCSD(T)/ANO4321
[27], and the best available experimental fundamentals [24], we obtain ZPVE=62.01 kcal/mol,
or, after correction for the difference at the B3LYP/TZ2P level between $\sum_i (\omega_i + \nu_i) d_i/4$ and
the true anharmonic ZPVE, we find a best-estimate ZPVE=62.08 kcal/mol.
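
The arithmetic behind the best-estimate ZPVE can be retraced in a few lines. The sketch below uses only the kcal/mol values quoted in the paragraph above; the helper `zpve_average` and the conversion factor of 349.755 cm^-1 per kcal/mol are additions for illustration, not part of the original procedure.

```python
CM1_PER_KCAL_MOL = 349.755  # 1 kcal/mol expressed in cm^-1 (assumed conversion)


def zpve_average(harmonics, fundamentals, degeneracies):
    """ZPVE estimate sum_i (omega_i + nu_i) d_i / 4, input in cm^-1, output in kcal/mol."""
    total = sum((w + v) * d / 4.0
                for w, v, d in zip(harmonics, fundamentals, degeneracies))
    return total / CM1_PER_KCAL_MOL


# Values quoted in the text (kcal/mol), B3LYP/TZ2P quartic force field.
half_sum_harmonics = 62.96
half_sum_fundamentals = 60.98
anharmonic_zpve = 62.04

average = 0.5 * (half_sum_harmonics + half_sum_fundamentals)
correction = anharmonic_zpve - average

# Same average from CCSD(T) harmonics and experimental fundamentals (from the text).
best_average = 62.01
best_estimate = best_average + correction

print(round(average, 2), round(correction, 2), round(best_estimate, 2))
# prints: 61.97 0.07 62.08
```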

Of the more approximate approaches used in various computational thermochemistry
methods, HF/6-31G* harmonic frequencies scaled by 0.8929 (as used in G2 and G3 theory [2])
yield 60.33 kcal/mol, or about 1.7 kcal/mol too low. The procedure used in the very recent
G3X and G3SX theories [28], B3LYP/6-31G(2df,p) scaled by 0.9854, however, reproduces
the best estimate to within 0.1 kcal/mol. B3LYP/6-311G** harmonic frequencies scaled
by 0.99, as used in CBS-QB3 [3], yield 62.23 kcal/mol, in very good agreement with the
best estimate; the HF/6-31G(d) estimate scaled by 0.9184, as used in CBS-Q, yields 61.69 kcal/mol,
slightly too low. Finally, B3LYP/cc-pVTZ harmonics scaled by 0.985 (as used in W1 and
W1h theory) yield 62.04 kcal/mol, in near-perfect agreement with the best estimate.
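
For a side-by-side view, the deviations of the scaled-frequency ZPVEs from the best estimate of 62.08 kcal/mol can be tabulated as follows. This is a bookkeeping sketch using only numbers quoted in the text; the G3X/G3SX value is omitted because the text gives only its agreement ("within 0.1 kcal/mol"), not the number itself.

```python
BEST_ZPVE = 62.08  # best-estimate anharmonic ZPVE, kcal/mol (from the text)

# Scaled harmonic ZPVEs quoted in the text, kcal/mol.
scaled_zpve = {
    "HF/6-31G* x 0.8929 (G2, G3)": 60.33,
    "B3LYP/6-311G** x 0.99 (CBS-QB3)": 62.23,
    "HF/6-31G(d) x 0.9184 (CBS-Q)": 61.69,
    "B3LYP/cc-pVTZ x 0.985 (W1, W1h)": 62.04,
}

# Signed deviation from the best estimate; negative means too low.
deviations = {label: value - BEST_ZPVE for label, value in scaled_zpve.items()}

for label, dev in deviations.items():
    print(f"{label:34s} {dev:+.2f} kcal/mol")
```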

Relevant data for the W2h calculation are collected in Table I. At first sight, the
disagreement between the W2h $\Delta H^\circ_{f,0\,\mathrm{K}}$=23.1 kcal/mol and the experimental value of 24.0±0.2
kcal/mol seems disheartening for such a CPU-intensive calculation. (Note that it 'errs' on
the far side of the most recent previous benchmark calculation [5], 24.7±0.3 kcal/mol, which
used similar-sized basis sets as W1 theory.) However, the comparison with experiment is not
entirely 'fair' since it neglects the experimental uncertainties in the atomic heats of formation
required to convert an atomization energy into a heat of formation (or vice versa). Combining
these with the experimental $\Delta H^\circ_{f,0\,\mathrm{K}}$ leads to an experimentally derived TAE_0=1305.7±0.7
kcal/mol, where the uncertainty is dominated by six times that in the heat of vaporization
of graphite. In other words, our calculated TAE_0=1306.6 kcal/mol is only 0.2 kcal/mol
removed from the upper end of the experimental uncertainty interval. (After all, an error of
0.02% seems to be a bit much to ask for.)
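
The conversion between heat of formation and atomization energy is a simple thermochemical cycle, sketched here in Python with the 0 K experimental values listed in footnote (b) of Table I. Combining the carbon and benzene uncertainties in quadrature is an assumption made for this illustration; the text states only that the carbon term dominates.

```python
import math

# Experimental 0 K heats of formation, kcal/mol (footnote (b) of Table I).
DHF_C_ATOM, SIGMA_C = 169.98, 0.11
DHF_H_ATOM = 51.634                      # uncertainty treated as negligible
DHF_BENZENE_EXP, SIGMA_BENZENE = 24.0, 0.12

N_C, N_H = 6, 6
ATOMS_TOTAL = N_C * DHF_C_ATOM + N_H * DHF_H_ATOM


def tae_from_dhf(dhf_molecule):
    """TAE_0 = sum of atomic heats of formation minus that of the molecule."""
    return ATOMS_TOTAL - dhf_molecule


def dhf_from_tae(tae):
    """Inverse conversion: heat of formation from an atomization energy."""
    return ATOMS_TOTAL - tae


tae_exp = tae_from_dhf(DHF_BENZENE_EXP)
# Assumption: the six carbon errors add linearly (same source), and the
# result is combined in quadrature with the benzene uncertainty.
sigma_tae = math.hypot(N_C * SIGMA_C, SIGMA_BENZENE)
dhf_w2h = dhf_from_tae(1306.6)           # computed W2h TAE_0 from the text

print(f"{tae_exp:.1f} {sigma_tae:.1f} {dhf_w2h:.1f}")
# prints: 1305.7 0.7 23.1
```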

Alternatively and equivalently, one could affix an uncertainty of ±0.7 kcal/mol to the
computed W2h $\Delta H^\circ_{f,0\,\mathrm{K}}$=23.1±0.7 kcal/mol, where the error bar only reflects the
uncertainties in the auxiliary experimental data (i.e. the heats of atomization of the elements),
but does not include the uncertainty in the theoretical calculation itself, which is harder to
quantify. While most chemists would prefer the heat of formation, an analysis in terms of 
atomization energies is somewhat more elegant since it avoids mixing computed and observed 
data. (Unfortunately, a benchmark ab initio heat of vaporization of graphite does not appear 
to be feasible at this point in time.) 

Secondly, let us consider the 'gaps' bridged by the extrapolations. For the SCF compo- 
nent, that is a very reasonable 0.3 kcal/mol (0.03 %), but for the CCSD valence correlation 
component this rises to 5 kcal/mol (1.7 %) while for the connected triple excitations con- 
tribution it amounts to 1 kcal/mol (3.7 % — note however that a smaller basis set is being 
used than for CCSD). It is clear that the extrapolations are indispensable to obtain even a 
useful result, let alone an accurate one, even with such large basis sets. 

Inner-shell correlation, at 7 kcal/mol, is of quite nontrivial importance, but even scalar
relativistic effects (at -1.0 kcal/mol) cannot be ignored. (The discrepancy between our scalar
relativistic correction and the previous SCF-level calculation of Kedziora et al. [29], -1.27
kcal/mol, is consistent with the known tendency [9,30,31] of SCF-level scalar relativistic
corrections to be overestimated by 20-25%.) And manifestly, even a 2% error in a 62
kcal/mol zero-point vibrational energy would be unacceptable.

Let us now consider the more approximate results. While W1h coincidentally agrees to
better than 0.2 kcal/mol with the W2h result, W1 deviates from the latter by 0.6 kcal/mol.
Note however that in W1h theory, the extrapolations bridge gaps of 0.8 (SCF), 10.1 (CCSD),
and 2.1 (T) kcal/mol, the corresponding amounts for W1 theory being 0.7, 9.1, and 1.9
kcal/mol, respectively. Common sense suggests that if extrapolations account for 13.0 (W1h)
and 11.7 (W1) kcal/mol, then a discrepancy of 1 kcal/mol should not come as a surprise;
in fact, the relatively good agreement between the two sets of numbers and the more rigorous
W2h result (total extrapolation: 6.3 kcal/mol) testifies, if anything, to the robustness of the
method.
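
The totals quoted above follow directly from the per-component gaps. The short Python check below uses the values given in the text; note that the W2h CCSD and (T) gaps are quoted there only to the nearest kcal/mol, so their sum reproduces the 6.3 kcal/mol total only at that level of rounding.

```python
# Extrapolation gaps in kcal/mol, as quoted in the text: (SCF, CCSD, (T)).
gaps = {
    "W1h": (0.8, 10.1, 2.1),
    "W1": (0.7, 9.1, 1.9),
    "W2h": (0.3, 5.0, 1.0),   # CCSD and (T) entries are rounded in the text
}

# Total amount of TAE bridged by basis set extrapolation for each method.
totals = {method: round(sum(parts), 1) for method, parts in gaps.items()}

for method, total in totals.items():
    print(f"{method:4s} total extrapolation: {total:.1f} kcal/mol")
```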

As for the difference of about 0.4 kcal/mol between the old-style and new-style [10]
SCF extrapolations in W1h and W1 theories, comparison with the W2h SCF limits clearly
confirms the new-style extrapolation to be the more reliable one. (The two extrapolations
yield basically the same result in W2h.) This should not be seen as an indication that the
$E_\infty + A/L^5$ formula is somehow better founded theoretically, but rather as an example of
why reliance on (aug-)cc-pVDZ data should be avoided if at all possible.

Our best TAE_0 value (W2h) differs by 1.6 kcal/mol from the previous benchmark
calculation of Feller and Dixon [5]. In fact, since their largest basis set is of AVQZ quality, the
appropriate comparison would be with our W1 atomization energy, which is 2.3 kcal/mol
larger than their result using RCCSD(T) atomic energies. The zero-point energy and the
corrections for core correlation, scalar relativistic effects, and atomic spin-orbit splitting are
all very similar in the two studies. Their extrapolation approach is very different from ours,
but in the event this difference nearly cancels out with that caused by the different definitions
of the RCCSD(T) energy used in the atomic calculations. (Feller and Dixon followed Ref.
[32], as opposed to Ref. [15] in the present paper: we find the difference for six carbon atoms
to be 0.52 kcal/mol at the CCSD(T)/AVQZ level.) The difference is in fact mostly due to a
-2.1 kcal/mol correction for 'higher-order correlation effects' applied in Ref. [5], which is an
estimate of the CCSDT - CCSD(T) difference from small basis set calculations. However,
the generally excellent quality of CCSD(T) computed bond energies rests to a large extent on
an error compensation between neglect of higher-order connected triple excitations (which
tend to reduce the binding energy) and complete neglect of quadruple excitations (which
tend to increase it) [33]. It has been known for some time (e.g. [34]) that CCSDT energies
are not necessarily closer to full CI than CCSD(T). Consequently, an accurate treatment
should either include both $T_4$ and higher-order $T_3$ effects where it is possible to do so, or
neglect both: including only the higher-order $T_3$ of necessity leads to an underestimate of
TAE. We do note that our respective best estimates bracket the experimental value, which
may indicate that the 'true' (full CI) TAE lies in between. However, in view of the
uncertainty on the experimental TAE and the impossibility to carry out even a highly approximate
CCSDTQ calculation on benzene, it is hard to make a definite statement about this.

Turning finally to the more approximate approaches, G2 theory clearly underestimates
TAE_0; G3 represents a major improvement, but the better than 1 kcal/mol agreement
between the G3 TAE_0 and the experimentally derived value in fact benefits from an error
compensation with the underestimated ZPVE: a rather more pronounced difference is seen
for TAE_e. This problem is remedied in the very recent G3X and G3SX theories, which
predict both TAE_e and TAE_0 to within 1 kcal/mol of experiment, as does CBS-QB3.
CBS-Q is slightly too low; the fairly elaborate CBS-APNO method [35] yields results that nearly
coincide with W1 theory. (We note that none of the Gn and CBS methods considered
explicitly includes scalar relativistic effects; they instead rely on them being absorbed into
the parametrization.)

Summarizing the above, we may state the following: 

The total atomization energy of benzene, C6H6, was computed fully ab initio by means
of W2h theory as 1306.6 kcal/mol, to be compared with the experimentally derived value
1305.7±0.7 kcal/mol. The computed result includes contributions from inner-shell
correlation (7.1 kcal/mol), scalar relativistic effects (-1.0 kcal/mol), atomic spin-orbit splitting
(-0.5 kcal/mol), and the anharmonic zero-point vibrational energy (62.1 kcal/mol). The 
largest-scale calculations involved are CCSD/cc-pV5Z and CCSD(T)/cc-pVQZ; basis set ex- 
trapolations account for 6.3 kcal/mol of the final result. Performance of more approximate 
methods has been analyzed. Our results suggest that, even for systems the size of benzene, 
chemically accurate molecular atomization energies can be obtained from fully first-principles 
calculations, without resorting to corrections or parameters derived from experiment. 

TABLES

TABLE I. Individual components in W1h, W1, and W2h total atomization energy cum heat of
formation of benzene.

(a) best estimate (see text).

(b) From $\Delta H^\circ_{f,0\,\mathrm{K}}$[C6H6(g)]=24.0±0.12 kcal/mol [36,37], $\Delta H^\circ_{f,0\,\mathrm{K}}$[C(g)]=169.98±0.11 kcal/mol
[38], and $\Delta H^\circ_{f,0\,\mathrm{K}}$[H(g)]=51.634 kcal/mol [38]. (The uncertainty in $\Delta H^\circ_{f,0\,\mathrm{K}}$[H(g)] is negligible.)

(c) All values except G2 include corrections for atomic spin-orbit splitting.